Search for: All records

Creators/Authors contains: "Matuszek, Cynthia"


  1. The proliferation of Large Language Models (LLMs) presents both a critical design challenge and a remarkable opportunity for the field of Human-Robot Interaction (HRI). While the direct deployment of LLMs on interactive robots may be unsuitable for reasons of ethics, safety, and control, LLMs might nevertheless provide a promising baseline technique for many elements of HRI. Specifically, in this position paper, we argue for the use of LLMs as Scarecrows: ‘brainless,’ straw-man black-box modules integrated into robot architectures for the purpose of quickly enabling full-pipeline solutions, much like the use of “Wizard of Oz” (WoZ) and other human-in-the-loop approaches. We explicitly acknowledge that these Scarecrows, rather than providing a satisfying or scientifically complete solution, incorporate a form of the wisdom of the crowd, and, in at least some cases, will ultimately need to be replaced or supplemented by a robust and theoretically motivated solution. We provide examples of how Scarecrows could be used in language-capable robot architectures as useful placeholders, and suggest initial reporting guidelines for authors, mirroring existing guidelines for the use and reporting of WoZ techniques. 
    Free, publicly-accessible full text available January 1, 2025
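    To make the Scarecrow idea concrete, here is a minimal sketch of a black-box LLM stand-in behind a fixed module interface; all names (DialogueModule, ScarecrowDialogue, llm_client) are hypothetical illustrations, not an implementation from the paper.

    ```python
    from abc import ABC, abstractmethod

    class DialogueModule(ABC):
        """Interface that a principled, theoretically motivated module
        would eventually implement."""
        @abstractmethod
        def respond(self, utterance: str, world_state: dict) -> str: ...

    class ScarecrowDialogue(DialogueModule):
        """'Brainless' black-box stand-in: defer to an LLM so the full
        pipeline can be assembled and tested end to end."""
        def __init__(self, llm_client):
            self.llm = llm_client  # any text-in/text-out callable

        def respond(self, utterance: str, world_state: dict) -> str:
            prompt = (f"Robot world state: {world_state}\n"
                      f"User said: {utterance}\nRobot reply:")
            return self.llm(prompt)  # swap out later without touching callers

    # Usage with a trivial stand-in for the LLM call:
    bot = ScarecrowDialogue(lambda prompt: "Okay, picking up the red block.")
    print(bot.respond("Hand me the red block.", {"objects": ["red block"]}))
    ```

    Because the rest of the architecture depends only on the interface, the Scarecrow can later be replaced by a robust, theoretically motivated module without changing its callers.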
  2. Over the past decade, the machine learning security community has developed a myriad of defenses for evasion attacks. An understudied question in that community is: for whom do these defenses defend? This work considers common approaches to defending learned systems and how security defenses result in performance inequities across different sub-populations. We outline appropriate parity metrics for analysis and begin to answer this question through empirical results on the fairness implications of machine learning security methods. We find that many proposed methods can cause direct harm, such as false rejection and unequal benefits from robustness training. The framework we propose for measuring defense equality can be applied to robustly trained models, preprocessing-based defenses, and rejection methods. We identify a set of datasets with a user-centered application and a reasonable computational cost suitable for case studies in measuring the equality of defenses. In our case study of speech command recognition, we show how adversarial training and augmentation provide unequal but complex protections for social subgroups across gender, accent, and age in relation to user coverage. We present a comparison of equality between two rejection-based defenses, randomized smoothing and neural rejection, finding randomized smoothing more equitable due to its sampling mechanism for minority groups. This is the first work examining disparities in adversarial robustness in the speech domain and the fairness of rejection-based defenses.
    Free, publicly-accessible full text available November 30, 2024
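    As one way to make the parity analysis concrete, the sketch below computes per-subgroup false rejection rates and a simple max-gap parity metric for a rejection-based defense; the metric and names here are illustrative assumptions, not necessarily the paper's exact formulation.

    ```python
    import numpy as np

    def false_rejection_rate(rejected: np.ndarray, is_benign: np.ndarray) -> float:
        """Fraction of benign inputs that the defense rejects."""
        benign = is_benign.astype(bool)
        return float(rejected[benign].mean()) if benign.any() else 0.0

    def frr_parity_gap(rejected, is_benign, groups):
        """Per-group false rejection rates and the max pairwise gap."""
        rates = {g: false_rejection_rate(rejected[groups == g],
                                         is_benign[groups == g])
                 for g in np.unique(groups)}
        vals = list(rates.values())
        return rates, max(vals) - min(vals)

    # Toy usage: a defense that rejects one benign input from each group.
    rejected = np.array([1, 0, 0, 1, 0, 0])
    is_benign = np.ones(6, dtype=bool)
    groups = np.array(["accent_a", "accent_a", "accent_a",
                       "accent_b", "accent_b", "accent_b"])
    per_group, gap = frr_parity_gap(rejected, is_benign, groups)
    print(per_group, gap)  # equal FRRs here, so the gap is 0.0
    ```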
  3. Free, publicly-accessible full text available August 1, 2024
  4. Machine learning models that sense human speech, body placement, and other key features are commonplace in human-robot interaction. However, the deployment of such models is not itself without risk. Research in the security of machine learning examines how such models can be exploited and the risks associated with these exploits. Unfortunately, the threat models produced by machine learning security research do not incorporate the rich sociotechnical underpinnings of the defenses they propose; as a result, efforts to improve the security of machine learning models may actually widen the performance differences across demographic groups, yielding systems whose risk mitigations work better for one group than another. In this work, we outline why current approaches to machine learning security present DEI concerns for the human-robot interaction community and where there are open areas for collaboration.
  5. In this paper, we present a shared manipulation task performed both in virtual reality with a simulated robot and in the real world with a physical robot. We chose a collaborative assembly task in which the human and robot work together to construct a simple electrical circuit. While there are platforms available for conducting human-robot interaction studies in virtual reality, there has not been significant work investigating how virtual reality can influence human perception of tasks that are typically done in person. We present an overview of the simulation environment used, describe the paired experiment being performed, and finally enumerate a set of design desiderata to be considered when conducting sim2real experiments involving humans in a virtual setting.
  6. Simulation provides vast benefits for the field of robotics and Human-Robot Interaction (HRI). This study investigates how sensor effects seen in the real domain can be modeled in simulation, and what role they play in effective Sim2Real domain transfer for learned perception models. The study considers introducing naive noise approaches, such as additive Gaussian and salt-and-pepper noise, as well as data-driven sensor effects models into simulation to represent Microsoft Kinect sensor phenomena seen on real-world systems. This study quantifies the benefit of each approach to modeling sensor effects in simulation by the object classification improvements it yields in the real domain. User studies are conducted to address our hypotheses by training grounded language models under each of the sensor effects modeling cases and evaluating the robot's interaction capabilities in the real domain. In addition to grounded language performance metrics, the user study evaluation includes surveys of the human participants' assessment of the robot's capabilities in the real domain. Results from this pilot study show benefits to modeling sensor noise in simulation for Sim2Real domain transfer. This study also begins to explore the effects that such models have on human-robot interactions.
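    For illustration, the following sketch applies the two naive noise models named above to a simulated depth image; the parameters are illustrative defaults, not calibrated Kinect values from the study.

    ```python
    import numpy as np

    def add_gaussian_noise(depth: np.ndarray, sigma: float = 0.01) -> np.ndarray:
        """Additive zero-mean Gaussian noise, in the depth image's units."""
        return depth + np.random.normal(0.0, sigma, size=depth.shape)

    def add_salt_and_pepper(depth: np.ndarray, p: float = 0.02) -> np.ndarray:
        """Drop random pixels to 0 ('pepper') or saturate them ('salt'),
        loosely mimicking depth dropout and specular artifacts."""
        noisy = depth.copy()
        mask = np.random.rand(*depth.shape)
        noisy[mask < p / 2] = 0.0               # pepper: missing returns
        noisy[mask > 1 - p / 2] = depth.max()   # salt: saturated readings
        return noisy

    sim_depth = np.random.uniform(0.5, 4.0, size=(480, 640))  # stand-in render
    augmented = add_salt_and_pepper(add_gaussian_noise(sim_depth))
    ```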
  7. Modern robotics heavily relies on machine learning and has a growing need for training data. Advances and commercialization of virtual reality (VR) present an opportunity to use VR as a tool to gather such data for human-robot interactions. We present the Robot Interaction in VR simulator, which allows human participants to interact with simulated robots and environments in real-time. We are particularly interested in spoken interactions between the human and robot, which can be combined with the robot's sensory data for language grounding. To demonstrate the utility of the simulator, we describe a study which investigates whether a user's head pose can serve as a proxy for gaze in a VR object selection task. Participants were asked to describe a series of known objects, providing approximate labels for the focus of attention. We demonstrate that using a concept of gaze derived from head pose can be used to effectively narrow the set of objects that are the target of participants' attention and linguistic descriptions. 
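    A minimal sketch of the head-pose-as-gaze idea, assuming head position and forward vector from VR tracking: cast a ray along the head's forward direction and keep objects inside an angular cone. The 15° half-angle is an arbitrary illustrative threshold, not a value from the study.

    ```python
    import numpy as np

    def candidate_objects(head_pos, head_forward, object_positions, max_deg=15.0):
        """Indices of objects within `max_deg` of the head ray, sorted by
        angular distance (most likely attention target first)."""
        forward = head_forward / np.linalg.norm(head_forward)
        to_obj = object_positions - head_pos
        to_obj /= np.linalg.norm(to_obj, axis=1, keepdims=True)
        angles = np.degrees(np.arccos(np.clip(to_obj @ forward, -1.0, 1.0)))
        hits = np.where(angles <= max_deg)[0]
        return hits[np.argsort(angles[hits])]

    head = np.array([0.0, 1.6, 0.0])       # head position (m) in tracking frame
    gaze_dir = np.array([0.0, -0.1, 1.0])  # head forward vector
    objs = np.array([[0.1, 1.4, 2.0], [1.5, 1.0, 1.0], [-0.05, 1.5, 1.5]])
    print(candidate_objects(head, gaze_dir, objs))
    ```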
  8. Learning to understand grounded language, which connects natural language to percepts, is a critical research area. Prior work in grounded language acquisition has focused primarily on textual inputs. In this work, we demonstrate the feasibility of performing grounded language acquisition on paired visual percepts and raw speech inputs. This will allow interactions in which language about novel tasks and environments is learned from end-users, reducing dependence on textual inputs and potentially mitigating the effects of demographic bias found in widely available speech recognition systems. We leverage recent work in self-supervised speech representation models and show that learned representations of speech can make language grounding systems more inclusive towards specific groups while maintaining or even increasing general performance. 
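    As a sketch of the speech side of such a system, the snippet below embeds raw audio with a pretrained self-supervised model (wav2vec 2.0 is used here as one example; the paper's exact encoder may differ) and mean-pools it into a fixed vector that a grounding model could pair with visual percept embeddings.

    ```python
    import torch
    from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

    extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
    encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

    def embed_speech(waveform: torch.Tensor, sample_rate: int = 16_000) -> torch.Tensor:
        """Map raw mono audio (1-D tensor) to a single embedding vector."""
        inputs = extractor(waveform.numpy(), sampling_rate=sample_rate,
                           return_tensors="pt")
        with torch.no_grad():
            hidden = encoder(**inputs).last_hidden_state  # (1, frames, dim)
        return hidden.mean(dim=1).squeeze(0)              # mean-pool over time

    # A grounding model would then score this vector against visual percept
    # embeddings, e.g. via cosine similarity or a learned compatibility head.
    ```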
  9. We propose a learning system in which language is grounded in visual percepts without specific pre-defined categories of terms. We present a unified generative method to acquire a shared semantic/visual embedding that enables the learning of language about a wide range of real-world objects. We evaluate the efficacy of this learning by predicting the semantics of objects and comparing the performance with neural and non-neural inputs. We show that this generative approach exhibits promising results in language grounding without pre-specifying visual categories under low resource settings. Our experiments demonstrate that this approach is generalizable to multilingual, highly varied datasets.
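    A loose sketch of one way a shared semantic/visual embedding could be realized generatively: encode both modalities into a common latent space and reconstruct each from it. The dimensions, architecture, and loss here are illustrative assumptions, not the paper's model.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SharedEmbedding(nn.Module):
        """Encode vision and language into one latent space; train by
        reconstructing both modalities from the shared latent."""
        def __init__(self, vis_dim=2048, lang_dim=300, latent_dim=128):
            super().__init__()
            self.enc_vis = nn.Linear(vis_dim, latent_dim)
            self.enc_lang = nn.Linear(lang_dim, latent_dim)
            self.dec_vis = nn.Linear(latent_dim, vis_dim)
            self.dec_lang = nn.Linear(latent_dim, lang_dim)

        def forward(self, vis, lang):
            z = 0.5 * (self.enc_vis(vis) + self.enc_lang(lang))  # shared latent
            return self.dec_vis(z), self.dec_lang(z)

    model = SharedEmbedding()
    vis = torch.randn(8, 2048)   # e.g. CNN features for 8 observed objects
    lang = torch.randn(8, 300)   # e.g. embeddings of their descriptions
    rec_vis, rec_lang = model(vis, lang)
    loss = F.mse_loss(rec_vis, vis) + F.mse_loss(rec_lang, lang)
    ```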